
A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness



Abstract

Computing workloads often contain a mix of interactive, latency-sensitive foreground applications and recurring background computations. To guarantee responsiveness, interactive and batch applications are often run on disjoint sets of resources, but this incurs additional energy, power, and capital costs. In this paper, we evaluate the potential of hardware cache partitioning mechanisms and policies to improve efficiency by allowing background applications to run simultaneously with interactive foreground applications, while avoiding degradation in interactive responsiveness. We evaluate these tradeoffs using commercial x86 multicore hardware that supports cache partitioning, and find that real hardware measurements with full applications yield different observations from past simulation-based evaluations. Co-scheduling applications without LLC partitioning leads to a 10% energy improvement and an average throughput improvement of 54% compared to running tasks separately, but can result in foreground performance degradation of up to 34%, with an average of 6%. With optimal static LLC partitioning, the average energy improvement increases to 12% and the average throughput improvement to 60%, while the worst-case slowdown is reduced noticeably to 7%, with an average slowdown of only 2%. We also evaluate a practical low-overhead dynamic algorithm to control partition sizes, and are able to realize the potential performance guarantees of the optimal static approach while increasing background throughput by an additional 19%.
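The dynamic partition-size controller is only summarized in the abstract. As an illustration of the general pattern it describes (grow the background LLC allocation while the interactive application meets its responsiveness target, shrink it when the target is violated), the sketch below uses the Linux resctrl interface to Intel Cache Allocation Technology. This is not the paper's algorithm; the 11-way LLC, the latency target, and the stubbed measure_foreground_latency() helper are assumptions for the example, and assigning PIDs to each group's tasks file is omitted.

```python
#!/usr/bin/env python3
"""Minimal sketch of a dynamic LLC-partitioning control loop (not the
paper's algorithm). Assumes resctrl is mounted and run with root
privileges; hardware parameters below are placeholders."""
import os
import time

RESCTRL = "/sys/fs/resctrl"        # resctrl mount point (assumed)
TOTAL_WAYS = 11                    # LLC associativity (hardware-specific)
MIN_BG_WAYS = 1                    # background always keeps one way
TARGET_LATENCY_MS = 5.0            # foreground responsiveness target (assumed)


def write_schemata(group: str, ways: int, shift: int = 0) -> None:
    """Give `group` a contiguous capacity bitmask of `ways` ways.
    Note: some hardware imposes a minimum CBM width."""
    os.makedirs(os.path.join(RESCTRL, group), exist_ok=True)
    cbm = ((1 << ways) - 1) << shift
    with open(os.path.join(RESCTRL, group, "schemata"), "w") as f:
        f.write(f"L3:0={cbm:x}\n")


def measure_foreground_latency() -> float:
    """Placeholder: replace with a real tail-latency measurement (ms)."""
    return 3.0


def control_loop(interval_s: float = 1.0) -> None:
    bg_ways = MIN_BG_WAYS
    while True:
        # Background gets the low-order ways, foreground the rest,
        # so the two partitions never overlap.
        write_schemata("background", bg_ways)
        write_schemata("foreground", TOTAL_WAYS - bg_ways, shift=bg_ways)

        time.sleep(interval_s)
        latency = measure_foreground_latency()

        if latency > TARGET_LATENCY_MS and bg_ways > MIN_BG_WAYS:
            bg_ways -= 1      # foreground is suffering: shrink background
        elif latency < 0.8 * TARGET_LATENCY_MS and bg_ways < TOTAL_WAYS - 1:
            bg_ways += 1      # ample slack: let the batch job use more LLC


if __name__ == "__main__":
    control_loop()
```

The additive grow / subtractive shrink policy here is only one possible choice; the paper's evaluation compares such dynamic control against an exhaustively determined optimal static partitioning.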
